Skip to content

ARROW-5317: [Rust] [Parquet] impl IntoIterator for SerializedFileReader#4323

Closed
FabioBatSilva wants to merge 3 commits intoapache:masterfrom
FabioBatSilva:reader-into-iter
Closed

ARROW-5317: [Rust] [Parquet] impl IntoIterator for SerializedFileReader#4323
FabioBatSilva wants to merge 3 commits intoapache:masterfrom
FabioBatSilva:reader-into-iter

Conversation

@FabioBatSilva
Copy link
Contributor

This patch adds a row iterator that takes ownership of the reader
and implements IntoIterator for SerializedFileReader

See :
#4301, https://issues.apache.org/jira/browse/ARROW-5317

Copy link
Member

@sunchao sunchao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @FabioBatSilva . It seems the newly introduced struct shares fairly amount of code with the existing RowIter. Instead, can we define a Either type like following:

enum Either<'a> {
    Left(&'a FileReader),
    Right(Box<FileReader>),
}

And then use this in RowIter<'a>:

file_reader: Option<Either<'a>>,

Then we just need to add one more method in RowIter that is similar to from_file but takes a Box<FileReader> instead.

Let me know what you think.

@FabioBatSilva
Copy link
Contributor Author

FabioBatSilva commented May 17, 2019

@sunchao I like the Either approach

But I don't know if we can impl IntoIterator given RowIter's lifetime .

impl IntoIterator for SerializedFileReader<File> {
    type Item = Row;
    type IntoIter = RowIter;
                    //^^^^^^^ expected lifetime parameter

    fn into_iter(self) -> Self::IntoIter {
        RowIter::from_file_reader(Box::new(self))
    }
}

Perhaps we should extract the common logic from RowIter but keep it as a different Iterator

@sunchao
Copy link
Member

sunchao commented May 17, 2019

Do you still need IntoIter in that case? For your example, you can just do:

    let it = vec.iter()
        .map(|p| {
            File::open(p).unwrap()
        })
        .map(|f| {
            SerializedFileReader::new(f).unwrap()
        })
        .flat_map(|reader| -> RowIter {
            RowIter::from_file_into(None, reader).unwrap()
        })

@FabioBatSilva
Copy link
Contributor Author

IntoIterator seems more idiomatic..
But for my use case RowIter::from_file_into(..) works fine..

Copy link
Member

@sunchao sunchao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. Thanks @FabioBatSilva !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants

Comments